Text File | 1995-01-24 | 4KB | 111 lines
Compiler optimizations
======================
The compiler applies a few easy-to-implement optimizations to generate
small and fast code. The first three (described below) give the Dhrystone
benchmark a considerable speed boost.
These optimizations increase compilation time, but because the compiler
compiles itself, the slowdown is more than made up for by the higher quality
code. The compiler consists of 14000 lines of (dense) code and compiles
itself in around 60 seconds on a 14 MHz MC68020.
These optimizations are also performed by the DICE compiler and assembler.
The commercial M2Amiga compiler, which was used to bootstrap M2C, generates
slower (and larger) code because it does not attempt any of the
following:
Register Allocation
-------------------
Simple local variables and parameters are allocated to registers based on
weighted usage.
Parameters are copied off the stack and into registers at procedure entry.
Locals used inside nested loops are given highest priority.
Typically there are 6 data registers (d2-d7) and 4 address registers
(a2,a3,a5,a6) available.
Procedures that contain complex expressions may have fewer registers
available.
Low-level procedures (those that call no others) can also use d0, d1, a0 and
a1 to hold locals.
Locals that are passed as VAR parameters to other procedures, that are
arguments of SYSTEM.ADR, or that are referenced from within nested procedures
(so-called non-local, non-global variables) cannot be allocated to registers.
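The rules above can be sketched in a few lines of Python. This is only an
illustration, not the compiler's actual code: the function name, the input
format and the weighting (each loop nesting level multiplies the cost of a
reference by 10) are assumptions. Sending pointer variables to address
registers and everything else to data registers is also an assumption.

```python
DATA_REGS = ["d2", "d3", "d4", "d5", "d6", "d7"]
ADDR_REGS = ["a2", "a3", "a5", "a6"]

def allocate_registers(uses, pointers=frozenset(), ineligible=frozenset(),
                       data_regs=DATA_REGS, addr_regs=ADDR_REGS):
    """uses: iterable of (variable, loop_depth) pairs, one per reference.

    ineligible: variables passed as VAR parameters, given to SYSTEM.ADR or
    referenced from nested procedures; these must stay in memory.
    """
    weight = {}
    for name, depth in uses:
        # Assumed weighting: a reference inside n nested loops counts 10**n.
        weight[name] = weight.get(name, 0) + 10 ** depth

    free = {"data": list(data_regs), "addr": list(addr_regs)}
    assignment = {}
    # Heaviest variables first; ties are broken by name to stay deterministic.
    for name in sorted(weight, key=lambda n: (-weight[n], n)):
        if name in ineligible:
            continue  # lives in the stack frame
        pool = free["addr" if name in pointers else "data"]
        if pool:
            assignment[name] = pool.pop(0)
    return assignment
```

A variable that does not appear in the result did not get a register and
would have to live in the stack frame.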
LINK/UNLINK removal
-------------------
If the compiler manages to allocate all parameters and variables of
a particular procedure to registers, it can remove the LINK & UNLK (unlink)
instruction pair, which normally builds and tears down a stack frame for
the variables in that procedure.
When this can be done, the procedure call overhead is greatly reduced. This is
especially important in small procedures, e.g. when using opaque types we
are sometimes forced to use a function just to read the contents of a field.
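As a rough illustration (not the compiler's real code generator), the
decision can be sketched in Python. The function name and the textual
instruction format are made up for this example, and the frame pointer
register is passed in because this document does not say which register
the compiler uses. Saving and restoring registers is left out.

```python
def wrap_procedure(body, frame_bytes, frame_reg):
    """Add procedure entry/exit code around an already generated body.

    frame_bytes is the stack space needed by variables that did NOT get a
    register; frame_reg is whichever address register holds the frame pointer.
    """
    if frame_bytes == 0:
        # Everything lives in registers: no frame, so no LINK/UNLK at all.
        return list(body) + ["rts"]
    return ([f"link {frame_reg},#-{frame_bytes}"]
            + list(body)
            + [f"unlk {frame_reg}", "rts"])
```

For a one-instruction field accessor, dropping the pair removes two of the
four instructions that would otherwise be executed per call.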
CASE statement
--------------
Generating code for CASE statements is quite tricky. The compiler must
cope adequately with a variety of styles:
CASE exp OF
|-42 : ..
|10000: .. (* This is bad style but is still legal. *)
END (* Some (inadequate) compilers will complain if you try this *)
or
--
CASE exp OF
|0: ..
|1: ..
|2: ..
|3: .. (* Case labels are contiguous *)
... (* This is really what the case statement is for *)
|100: ..
END ;
or a combination of the two styles.
The algorithm for the CASE statement is:
If there are only a few case labels (fewer than 8, say), perform a linear
search.
If the number of case labels is a high proportion (at least 50 percent, say) of
MAX(label)-MIN(label), perform a table lookup to see where to branch to.
Otherwise subdivide the labels (into 2 halves) and recursively apply the
algorithm.
The compiler generates code to implement this algorithm.
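The three rules can be written down as a small recursive planner. The
Python below is a sketch under assumptions: the function name, the tuple
format of the result and the choice of split point (the first label of the
upper half) are invented here, and the thresholds are just the "say" values
quoted above. It returns a plan instead of emitting 68000 code.

```python
def plan_case(labels, linear_limit=8, min_density=0.5):
    """Return a nested tuple describing how to dispatch on the given labels."""
    labels = sorted(labels)
    if len(labels) < linear_limit:
        # Few labels: compare against each one in turn.
        return ("linear", labels)
    span = labels[-1] - labels[0]  # MAX(label)-MIN(label)
    if len(labels) >= min_density * span:
        # Dense enough: index a jump table covering MIN..MAX.
        return ("table", labels[0], labels[-1])
    mid = len(labels) // 2
    # Sparse: test against the first label of the upper half, then recurse.
    return ("split", labels[mid],
            plan_case(labels[:mid], linear_limit, min_density),
            plan_case(labels[mid:], linear_limit, min_density))
```

With the labels 0 to 100 this gives a single table; with just -42 and 10000
it gives a linear search; a mixture of both styles produces a tree of
splits with tables and linear searches at the leaves.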
Branch Optimization
-------------------
The M68000 provides a 16-bit as well as a 32-bit form of the branch instruction.
When the compiler generates a forward branch, it's impossible for it to know
which to use. To be safe it generates space for a 32-bit instruction. If it
subsequently turns out that only a 16-bit instruction was needed, the code
will contain a 2-byte hole. The compiler keeps track of where these holes
occur and, after code generation, compacts all the code.
Branch optimization typically reduces code size by about 5 percent.
If you spend a few minutes thinking about it, you'll realize that performing
branch optimization is more complex than it appears.
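One source of the complexity is that closing a hole moves all the code
behind it, which can bring other branches within short range. The Python
sketch below shows one simple way to deal with that: keep re-scanning until
nothing changes. It is an assumption-laden model, not the compiler's own
method. Code is a list of ("op", size) and ("br", target_index) items, and
a short branch is taken to reach -128..127 bytes from the branch address
plus 2, with a displacement of 0 not allowed in the short form.

```python
SHORT, LONG = 2, 4  # branch instruction sizes in bytes

def compact_branches(code):
    """code: list of ("op", size_in_bytes) or ("br", target_index) items.

    Returns the final size of every item after shrinking branches.
    """
    sizes = [LONG if kind == "br" else arg for kind, arg in code]
    changed = True
    while changed:
        changed = False
        # Address of every item (plus one past the end) under current sizes.
        addr = [0]
        for size in sizes:
            addr.append(addr[-1] + size)
        for i, (kind, target) in enumerate(code):
            if kind != "br" or sizes[i] == SHORT:
                continue
            # A forward target moves 2 bytes closer once this branch shrinks.
            shrink = LONG - SHORT if target > i else 0
            disp = addr[target] - shrink - (addr[i] + 2)
            # Displacement 0 cannot be encoded in the short form.
            if -128 <= disp <= 127 and disp != 0:
                sizes[i] = SHORT
                changed = True
    return sizes
```

In the list [("br", 3), ("op", 124), ("br", 4), ("op", 2), ("op", 2)] the
first branch is 2 bytes out of short range on the first pass and only fits
after the second branch has been shortened, so a single pass is not enough.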
Addressing Models
-----------------
The compiler supports (and defaults to) the small data (A4 relative) and
small code (PC relative) memory models. These models reduce the size of the
relocation information in the final executable, i.e. it's smaller and loads
faster.
Future optimizations
--------------------
The compiler does not implement registerized parameter passing; this may
be added. It doesn't do any 'global' optimizations either, because I don't
know how :) This is only my second compiler; the first was for a toy
programming language (at college).